Smart Paper Notes

Risk assessment and mitigation of e-scooter crashes with naturalistic driving data

Avinash Prabu , Renran Tian , Stanley Chien , Lingxi Li , Yaobin Chen , Rini Sherony

Category: Computer Vision

2022-12-24

Recently, e-scooter-involved crashes have increased significantly, but little information is available about the behaviors of on-road e-scooter riders. Most existing e-scooter crash research has relied on retrospective, descriptive sources: media reports, emergency room patient records, and crash reports. This paper presents a naturalistic driving study with a focus on e-scooter and vehicle encounters. The goal is to quantitatively measure the behaviors of e-scooter riders in different encounters to help facilitate crash scenario modeling, baseline behavior modeling, and the potential future development of in-vehicle mitigation algorithms. The data was collected using an instrumented vehicle and a wearable system worn by an e-scooter rider. A three-step data analysis process is developed. First, semi-automatic data labeling extracts e-scooter rider images and non-rider human images in similar environments to train an e-scooter-rider classifier. Then, a multi-step scene reconstruction pipeline generates vehicle and e-scooter trajectories in all encounters. The final step is to model e-scooter rider behaviors and e-scooter-vehicle encounter scenarios. A total of 500 vehicle-to-e-scooter interactions are analyzed, and the variables describing these interactions are also discussed.

A Wearable Data Collection System for Studying Micro-Level E-Scooter Behavior in Naturalistic Road Environment

Avinash Prabu , Dan Shen , Renran Tian , Stanley Chien , Lingxi Li , Yaobin Chen , Rini Sherony

Category: Computer Vision

2022-12-22

As one of the most popular micro-mobility options, e-scooters are spreading in hundreds of big cities and college towns in the US and worldwide. At the same time, e-scooters pose new challenges to traffic safety. In general, e-scooters are expected to be ridden in bike lanes or on sidewalks, or to share the road with cars, at a maximum speed of about 15-20 mph, which makes them more flexible and much faster than pedestrians and bicyclists. These features make e-scooters challenging for human drivers, pedestrians, vehicle active safety modules, and self-driving modules to detect and interact with. To study this new mobility option and address the safety concerns of e-scooter riders and other road users, this paper proposes a wearable data collection system for investigating micro-level e-scooter motion behavior in a naturalistic road environment. An e-scooter-based data acquisition system has been developed by integrating LiDAR, cameras, and GPS using the Robot Operating System (ROS). Software frameworks are developed to support hardware interfaces, sensor operation, sensor synchronization, and data saving. The integrated system can collect data continuously for hours, meeting all the requirements, including calibration accuracy and the capability of collecting vehicle and e-scooter encounter data.
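
The abstract above names sensor synchronization as one job of the software framework but does not say how it is done. A minimal sketch of one common approach, nearest-timestamp matching within a tolerance, is shown below. The function name `match_nearest`, the timestamps, and the tolerance are all hypothetical, and this is not the authors' ROS implementation.

```python
from bisect import bisect_left


def match_nearest(reference_stamps, other_stamps, tolerance):
    """Pair each reference timestamp with the nearest one from another sensor.

    other_stamps must be sorted in ascending order. Returns, for every
    reference timestamp, the index of the closest entry in other_stamps,
    or None when the closest entry is further away than `tolerance`.
    """
    matches = []
    for t in reference_stamps:
        pos = bisect_left(other_stamps, t)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [k for k in (pos - 1, pos) if 0 <= k < len(other_stamps)]
        best = min(candidates, key=lambda k: abs(other_stamps[k] - t), default=None)
        if best is not None and abs(other_stamps[best] - t) <= tolerance:
            matches.append(best)
        else:
            matches.append(None)
    return matches


# Hypothetical timestamps in seconds: LiDAR sweeps matched to camera frames.
lidar = [0.0, 0.1, 0.2]
camera = [0.01, 0.12, 0.5]
pairs = match_nearest(lidar, camera, tolerance=0.03)
```

Here the third LiDAR sweep stays unmatched because no camera frame falls within the tolerance, which is the behavior one usually wants when a sensor drops frames.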

SceNDD: A Scenario-based Naturalistic Driving Dataset

Avinash Prabu , Nitya Ranjan , Lingxi Li , Renran Tian , Stanley Chien , Yaobin Chen , Rini Sherony

Category: Robotics

2022-12-22

In this paper, we propose SceNDD: a scenario-based naturalistic driving dataset built upon data collected from an instrumented vehicle in downtown Indianapolis. The data collection was completed in 68 driving sessions with different drivers, where each session lasted about 20 to 40 minutes. The main goal of creating this dataset is to provide the research community with real driving scenarios that have diverse trajectories and driving behaviors. The dataset contains the ego vehicle's waypoints, velocity, and yaw angle, as well as each non-ego actor's waypoints, velocity, yaw angle, entry time, and exit time. Users have some flexibility to add actors, sensors, lanes, roads, and obstacles to the existing scenarios. We used a Joint Probabilistic Data Association (JPDA) tracker to detect non-ego vehicles on the road. We present some preliminary results of the proposed dataset and a few applications associated with it. The complete dataset is expected to be released by early 2023.
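
The fields listed in the abstract above suggest a simple per-scenario record. The sketch below is one way such a record might be represented in Python. The class names, field names, and units are assumptions made for illustration and are not the dataset's published schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ActorTrack:
    """Trajectory of one actor (hypothetical layout, assumed units)."""
    waypoints: List[Tuple[float, float]]  # (x, y) positions, assumed metres
    velocity: List[float]                 # speed at each waypoint, assumed m/s
    yaw: List[float]                      # heading at each waypoint, assumed degrees
    entry_time: float = 0.0               # time the actor enters the scenario, seconds
    exit_time: float = 0.0                # time the actor leaves the scenario, seconds


@dataclass
class Scenario:
    """One driving scenario: the ego vehicle plus any non-ego actors."""
    ego: ActorTrack
    actors: List[ActorTrack] = field(default_factory=list)

    def actors_present_at(self, t: float) -> List[ActorTrack]:
        """Return the non-ego actors whose entry/exit window contains time t."""
        return [a for a in self.actors if a.entry_time <= t <= a.exit_time]


# Made-up values, only to show how the record is used.
ego = ActorTrack([(0.0, 0.0), (5.0, 0.0)], [5.0, 5.0], [0.0, 0.0], 0.0, 10.0)
car = ActorTrack([(20.0, 3.5)], [4.0], [180.0], entry_time=2.0, exit_time=6.0)
scenario = Scenario(ego=ego, actors=[car])
```

Keeping entry and exit times on each actor makes it cheap to ask which actors are visible at a given instant, and adding a new actor to an existing scenario is a single list append.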

Fine-Grained Semantically Aligned Vision-Language Pre-Training

Juncheng Li , Xin He , Longhui Wei , Long Qian , Linchao Zhu , Lingxi Xie , Yueting Zhuang , Qi Tian , Siliang Tang

Category: Computer Vision

2022-08-04

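Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks. Existing methods mainly model cross-modal alignment through the similarity of global image and text representations, or through advanced cross-modal attention over image and text features. However, because only global image-text alignment information is available, they fail to explicitly learn the fine-grained semantic alignment between visual regions and textual phrases. In this paper, we introduce Loupe, a fine-grained semantically aligned vision-language pre-training framework that learns fine-grained semantic alignment from the novel perspective of game-theoretic interactions. To compute the game-theoretic interactions efficiently, we further propose an uncertainty-aware neural Shapley interaction learning module. Experiments show that Loupe achieves state-of-the-art results on image-text retrieval benchmarks. Without any object-level human annotations or fine-tuning, Loupe achieves competitive performance on object detection and visual grounding. More importantly, Loupe points to a new direction for learning fine-grained semantics from large-scale raw image-text pairs.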
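
To make the notion of a game-theoretic interaction concrete, the sketch below computes the standard pairwise Shapley interaction index by exact enumeration on a toy game. It is not the paper's neural module: the function name and the toy value function are illustrative assumptions, and exact enumeration grows exponentially with the number of players, which is the cost a learned approximation is meant to avoid.

```python
from itertools import combinations
from math import factorial


def shapley_interaction(players, value, i, j):
    """Exact pairwise Shapley interaction index between players i and j.

    `value` maps a frozenset of players to a real number. The index averages
    value(S + {i, j}) - value(S + {i}) - value(S + {j}) + value(S)
    over all coalitions S that contain neither i nor j.
    """
    others = [p for p in players if p not in (i, j)]
    n = len(players)
    total = 0.0
    for size in range(len(others) + 1):
        # Weight of every coalition of this size; the weights sum to 1 overall.
        weight = factorial(size) * factorial(n - size - 2) / factorial(n - 1)
        for subset in combinations(others, size):
            s = frozenset(subset)
            total += weight * (
                value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            )
    return total


# Toy game: the payoff is 1 only when "region" and "phrase" are both present.
def toy_value(coalition):
    return 1.0 if {"region", "phrase"} <= coalition else 0.0


players = ["region", "phrase", "background"]
strong = shapley_interaction(players, toy_value, "region", "phrase")
none = shapley_interaction(players, toy_value, "region", "background")
```

In the toy game the aligned pair receives a large interaction while the unrelated pair receives none, which is the kind of signal a fine-grained alignment objective can build on.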

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

Maosen Li , Siheng Chen , Zijing Zhang , Lingxi Xie , Qi Tian , Ya Zhang

Category: Computer Vision

2022-07-31

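Methods based on graph convolutional networks, which model the relations between body joints, have recently shown great promise in 3D skeleton-based human motion prediction. However, these methods have two critical issues: first, they filter features only within a limited graph spectrum, losing information that lies in the rest of the band; second, they model the entire body with a single graph, underestimating the diverse patterns of the individual body parts. To address the first issue, we propose adaptive graph scattering, which uses multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands. To address the second issue, body parts are modeled separately to learn diverse dynamics, which enables finer feature extraction along the spatial dimension. Integrating these two designs, we propose a novel skeleton-parted graph scattering network (SPGSN). The core of the model is a cascaded multi-part graph scattering block (MPGSB), which builds adaptive graph scattering on the different body parts and fuses the decomposed features based on the inferred spectrum importance and body-part interactions. Extensive experiments show that SPGSN outperforms state-of-the-art methods by 13.8%, 9.3%, and 2.7% in terms of 3D mean per joint position error (MPJPE) on the Human3.6M, CMU Mocap, and 3DPW datasets, respectively.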
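
The metric reported above, mean per joint position error (MPJPE), is the average Euclidean distance between predicted and ground-truth 3D joint positions. A minimal sketch follows. The nested-list layout and the sample coordinates are assumptions for illustration, and real evaluations typically average over many sequences and prediction horizons.

```python
from math import dist


def mpjpe(predicted, target):
    """Mean per joint position error.

    Both arguments are lists of frames, and each frame is a list of
    (x, y, z) joint positions in the same units and joint order.
    """
    errors = [
        dist(p, t)
        for pred_frame, true_frame in zip(predicted, target)
        for p, t in zip(pred_frame, true_frame)
    ]
    return sum(errors) / len(errors)


# One frame with two joints: the first is exact, the second is off by 5 units.
predicted = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
target = [[(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]]
error = mpjpe(predicted, target)
```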

TAPE: Task-Agnostic Prior Embedding for Image Restoration

Lin Liu , Lingxi Xie , Xiaopeng Zhang , Shanxin Yuan , Xiangyu Chen , Wengang Zhou , Houqiang Li , Qi Tian

Category: Computer Vision

2022-03-11

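Learning a generalized prior for natural image restoration is an important yet challenging task. Early methods mostly relied on handcrafted priors, including normalized sparsity, L_0 gradients, and dark channel priors. More recently, deep neural networks have been used to learn various image priors, but they are not guaranteed to generalize. In this paper, we propose a novel approach that embeds a task-agnostic prior into a Transformer. Our approach, named Task-Agnostic Prior Embedding (TAPE), consists of two stages: task-agnostic pre-training and task-specific fine-tuning. The first stage embeds prior knowledge about natural images into the Transformer, and the second stage extracts that knowledge to assist downstream image restoration. Experiments on various types of degradation validate the effectiveness of TAPE. Image restoration performance in terms of PSNR improves by as much as 1.45 dB and even surpasses task-specific algorithms. More importantly, TAPE shows the ability to disentangle generalized image priors from degraded images, and these priors transfer well to unknown downstream tasks.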
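
The improvement above is reported in PSNR (peak signal-to-noise ratio), defined as 10 * log10(MAX^2 / MSE). The sketch below shows the computation on flat lists of pixel values. The function name, the 8-bit peak value of 255, and the sample pixels are assumptions, since the abstract does not state the evaluation setup.

```python
from math import inf, log10


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(restored):
        raise ValueError("images must contain the same number of pixels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return inf  # identical images: PSNR is unbounded
    return 10.0 * log10(max_value ** 2 / mse)


# Two made-up pixels, each off by 10 grey levels.
score = psnr([100.0, 100.0], [110.0, 90.0])
```

Because PSNR is logarithmic in the mean squared error, a gain of 1.45 dB corresponds to roughly a 28% lower MSE.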

A Structure Feature Algorithm for Multi-modal Forearm Registration

Jiaxin Li , Yan Ding , Weizhong Zhang , Yifan Zhao , Lingxi Guo , Zhe Yang

Category: Computer Vision

2021-11-10

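Augmented reality technology based on image registration is becoming increasingly popular for pre-surgical preparation and medical education. This paper focuses on the registration of forearm images with digital anatomical models. Because texture features differ across forearm multi-modal images, this paper proposes a forearm feature representation curve (FFRC), built on a structure-compatible multi-modal image registration framework for the forearm.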

Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models

Chengcheng Ma , Yang Liu , Jiankang Deng , LingXi Xie , Weiming Dong , Changsheng Xu

Category: Computer Vision

2022-11-04

Pre-trained Vision-Language Models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts. Instead of designing prompts manually, Context Optimization (CoOp) has recently been proposed to learn continuous prompts using task-specific training data. Despite the performance improvements on downstream tasks, several studies have reported that CoOp suffers from overfitting in two respects: (i) the test accuracy on base classes first improves and then degrades during training; (ii) the test accuracy on novel classes keeps decreasing. However, none of the existing studies explains or mitigates this overfitting problem effectively. In this paper, we first explore the cause of overfitting by analyzing the gradient flow. Comparative experiments reveal that CoOp favors generalizable features in the early training stage and spurious features in the later stage, which leads to the non-overfitting and overfitting phenomena respectively. Given those observations, we propose Subspace Prompt Tuning (SubPT), which throughout training projects the back-propagated gradients onto the low-rank subspace spanned by the early-stage gradient flow eigenvectors, and successfully eliminates the overfitting problem. In addition, we equip CoOp with a Novel Feature Learner (NFL) to enhance the generalization ability of the learned prompts to novel categories beyond the training set, without needing image training data. Extensive experiments on 11 classification datasets demonstrate that SubPT+NFL consistently boosts the performance of CoOp and outperforms the state-of-the-art approach CoCoOp. Experiments on more challenging downstream vision tasks, including open-vocabulary object detection and zero-shot semantic segmentation, also verify the effectiveness of the proposed method. Code can be found at https://tinyurl.com/mpe64f89.
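
The projection idea behind SubPT can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' code: it assumes the early-stage gradients are stacked as rows of a matrix and that the subspace is taken from the top right singular vectors of that matrix (equivalently, the leading eigenvectors of its Gram matrix). The function names, the rank, and the toy gradients are all hypothetical.

```python
import numpy as np


def early_stage_basis(early_grads, rank):
    """Orthonormal basis of the dominant subspace of early-stage gradients.

    early_grads has shape (steps, dim). The right singular vectors of this
    matrix are the eigenvectors of early_grads.T @ early_grads, sorted by
    decreasing eigenvalue, so the first `rank` of them span the subspace.
    """
    g = np.asarray(early_grads, dtype=float)
    _, _, vt = np.linalg.svd(g, full_matrices=False)
    return vt[:rank].T  # shape (dim, rank), orthonormal columns


def project_gradient(grad, basis):
    """Project a gradient onto the subspace spanned by the basis columns."""
    grad = np.asarray(grad, dtype=float)
    return basis @ (basis.T @ grad)


# Toy early gradients that only move along the first two coordinates.
early = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 0.0]]
basis = early_stage_basis(early, rank=2)
later = project_gradient([1.0, 1.0, 1.0], basis)  # third component is removed
```

In a training loop, the projected gradient would replace the raw gradient before the optimizer step, so later updates cannot leave the directions found early in training.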

Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast

Kaifeng Bi , Lingxi Xie , Hengheng Zhang , Xin Chen , Xiaotao Gu , Qi Tian

Category: Artificial Intelligence | Computer Vision | Machine Learning

2022-11-03

In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast. For this purpose, we establish a data-driven environment by downloading 43 years of hourly global weather data from the 5th generation of ECMWF reanalysis (ERA5) data and train a few deep neural networks with about 256 million parameters in total. The spatial resolution of forecast is 0.25° × 0.25°, comparable to the ECMWF Integrated Forecast Systems (IFS). More importantly, for the first time, an AI-based method outperforms state-of-the-art numerical weather prediction (NWP) methods in terms of accuracy (latitude-weighted RMSE and ACC) of all factors (e.g., geopotential, specific humidity, wind speed, temperature, etc.) and in all time ranges (from one hour to one week). There are two key strategies to improve the prediction accuracy: (i) designing a 3D Earth Specific Transformer (3DEST) architecture that formulates the height (pressure level) information into cubic data, and (ii) applying a hierarchical temporal aggregation algorithm to alleviate cumulative forecast errors. In deterministic forecast, Pangu-Weather shows great advantages for short to medium-range forecast (i.e., forecast time ranges from one hour to one week). Pangu-Weather supports a wide range of downstream forecast scenarios, including extreme weather forecast (e.g., tropical cyclone tracking) and large-member ensemble forecast in real-time. Pangu-Weather not only ends the debate on whether AI-based methods can surpass conventional NWP methods, but also reveals novel directions for improving deep learning weather forecast systems.
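
The accuracy figures above use latitude-weighted RMSE, which weights grid cells by the cosine of their latitude so that the densely sampled polar rows of a regular latitude-longitude grid do not dominate the score. The sketch below follows the common convention of normalizing the weights to average 1. The exact normalization used in the paper is an assumption here, and the function name and sample grid are illustrative.

```python
from math import cos, radians, sqrt


def lat_weighted_rmse(forecast, truth, latitudes_deg):
    """Latitude-weighted RMSE on a regular latitude-longitude grid.

    forecast and truth are lists of rows, one row per latitude in
    latitudes_deg, and each row holds the values along longitude.
    """
    cos_lat = [cos(radians(lat)) for lat in latitudes_deg]
    mean_cos = sum(cos_lat) / len(cos_lat)
    weights = [c / mean_cos for c in cos_lat]  # weights average to 1

    total = 0.0
    count = 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += w * (f - t) ** 2
            count += 1
    return sqrt(total / count)


# Made-up 3 x 2 grid with a uniform error of 2 units everywhere.
lats = [-60.0, 0.0, 60.0]
truth = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
uniform = lat_weighted_rmse([[2.0, 2.0]] * 3, truth, lats)

# The same error placed only at the equator or only at 60 degrees north.
equator = lat_weighted_rmse([[0.0, 0.0], [2.0, 2.0], [0.0, 0.0]], truth, lats)
high_lat = lat_weighted_rmse([[0.0, 0.0], [0.0, 0.0], [2.0, 2.0]], truth, lats)
```

A uniform error yields the same value as an unweighted RMSE, while an error confined to the equator counts for more than the same error at high latitude.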

Visual Recognition by Request

Chufeng Tang , Lingxi Xie , Xiaopeng Zhang , Xiaolin Hu , Qi Tian

Category: Computer Vision

2022-07-28

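In this paper, we present a novel annotation and evaluation protocol for visual recognition. Unlike traditional settings, the protocol does not require the labeler or algorithm to annotate or recognize all targets (objects, parts, etc.) at once. Instead, it issues a number of recognition instructions, and the algorithm recognizes targets by request. This mechanism brings two properties that reduce the annotation burden: (i) variable granularity: different scenarios can have different levels of annotation, and in particular object parts need to be labeled only on large, clear instances; (ii) open domain: new concepts can be added to the database at minimal cost. To handle the proposed setting, we maintain a knowledge base and design a query-based visual recognition framework that constructs queries on the fly according to the requests. We evaluate the recognition system on two mixed-annotation datasets, CPP and ADE20K, and demonstrate its promising ability to learn from partially labeled data and to adapt to new concepts using only text labels.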
